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Genetic regulation of Nmx1 expression: 
an integrative cross-species analysis of 
schizophrenia candidate genes 

K Mozhui\ X Wang\ J Chen^ MK Mulligan\ Z U\ J lngles\ X Chen^ L Lu^ and RW Williams^ 

Neurexin 1 (NRXN1) is a large presynaptic transmembrane protein that has complex and variable patterns of expression in the 
brain. Sequence variants in NRXN1 are associated with differences in cognition, and with schizophrenia and autism. The murine 
Nrxn1 gene is also highly polymorphic and is associated with significant variation in expression that is under strong genetic 
control. Here, we use co-expression analysis, high coverage genomic sequence, and expression quantitative trait locus (eQTL) 
mapping to study the regulation of this gene in the brain. We profiled a family of 72 isogenic progeny strains of a cross between 
C57BLy6J and DBA/2J (the BXD family) using exon arrays and massively parallel RNA sequencing. Expression of most Nrxn1 
exons have high genetic correlation (r> 0.6) because of the segregation of a common trans eQTL on chromosome (Chr) 8 and a 
common cis eQTL on Chr 17. These two loci are also linked to murine phenotypes relevant to schizophrenia and to a novel 
human schizophrenia candidate gene with high neuronal expression (Pleckstrin and Sec7 domain containing 3). In both human 
and mice, NRXN1 is co-expressed with numerous synaptic and cell signaling genes, and known schizophrenia candidates. 
Cross-species co-expression and protein interaction network analyses identified giycogen synttiase idnase 3 beta (GSK3B) as 
one of the most consistent and conserved covariates of NRXN1. By using the Molecular Genetics of Schizophrenia data set, we 
were able to test and confirm that markers in NRXN1 and GSK3B have epistatic interactions in human populations that can jointly 
modulate risk of schizophrenia. 

Translational Psychiatry (20^^) 1, e25; doi:10.1038/tp.2011.24; published online 26 July 2011 



Introduction 

Neurexins (NRXNs) are neuronal cell adhesion proteins 
inserted mainly into the presynaptic membrane. They serve 
as bridging molecules that directly bind the presynaptic 
neurotransmitter machinery with postsynaptic transmem- 
brane proteins such as the neuroligins and neurotransmitter 
receptors.''"^ The NRXNs have an important role in regulation 
of presynaptic Ca^+ channels and neurotransmitter release, 
and are also involved in the modulation of postsynaptic NMDA 
and y-aminobutyric acid (A) receptor activity.^"^ 

There are three members in the NRXN gene family — Nrxn1, 
Nrxn2 and Nrxn3. They are among the largest of mammalian 
genes, Nrxn1 being close to 1.1 Mb and A/rxn2 about 1.7 Mb, 
and all are highly conserved across vertebrates. A 
fascinating aspect of the NRXNs is their enormously complex 
and extensive alternative splicing that can potentially produce 
about a thousand isoforms.^^ Each NRXN gene has two 
promoters and transcribes a long version (a-NRXN) and a short 
version ((3-NRXN). Multiple isoforms of both long and short 
variants are produced as a result of alternative splicing from five 
splicing regions within each gene.''^ '''' Deletion of the larger 
a-NRXNs leads to marked deficits in synaptic function. ''^ '"^ 

Among the NRXN genes, NRXN1 has been implicated in 
neurodevelopmental and psychiatric disorders. Genome-wide 



association studies have consistently found links between 
copy number variation (primarily deletions) in NRXN1 and 
autism and schizophrenia.''^"''^ Associations have also been 
found between sequence variants in NRXN1 and nicotine 
addiction and alcoholism. In mice, deletion of Nrxnia 
causes a decrease in excitatory synaptic transmission in the 
hippocampus, and behavioral changes including enhance- 
ment of motor learning and decreased prepulse inhibition 
(PPI) to acoustic startle — an endophenotype that models 
sensory-gating deficits in schizophrenia.^^ 

These studies in human and mice have established an 
important role for NRXN1 in the etiology of neuropsychiatric 
diseases. However, disorders such as schizophrenia and 
autism are inherently polygenic and emerge from the interplay 
of multiple sequence variants and environmental factors. A 
high-priority candidate such as NRXN1 is bound to operate 
within the context of a wider network of genes and is likely to 
involve non-linear and epistatic interactions that contribute to 
disease risk. Despite the importance of epistasis (gene-gene 
interactions) in disease process, current large-scale genetic 
studies rarely go beyond linear additive models.^"* Those that 
do apply a network approach to genome-wide association 
studies^^"^^ rely on known protein interactions and pathways, 
and are essentially constrained by prior knowledge. 
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Here, we ask whether we can use gene co-expression 
analysis in the mouse to define and refine genetic and 
molecular networks for Nrxn1 that provide insights into 
neuronal processes and possible disease mechanisms. To 
what extent can we translate from mouse to human at the level 
of genetic networks? Specifically, can data from a large family 
of mouse strains reveal novel partners for Nrxn1 that are 
relevant to human disease? The Nrxn1 locus is highly 
polymorphic in the BXD family and contains several single- 
nucleotide polymorphisms (SNPs) and insertions and dele- 
tions. Applying a reverse complex trait approach, we have 
treated these natural sequence variants and corresponding 
variations in transcript abundance as a starting point in the 
dissection of downstream phenotypes.^^ We used expression 
QTL (eQTL) analysis to evaluate the genetic regulation of 
Nrxn1 and to build an interaction network around Nrxn1 that 
includes other genes and also key behavioral phenotypes. 
Finally, we used human data extracted from the Molecular 
Genetics of Schizophrenia (MGS) study^° to evaluate the 
translational relevance of the murine analysis and to test a 
specific hypothesis about an epistatic interaction between 
NRXN1 and glycogen synthase kinase 3 beta (GSK3B). 

Materials and methods 

Mouse strains and resources. The BXD inbred strains 
were derived by crossing the parental strains G57BL/6J (B6) 
and DBA/2J (D2) followed by >20 generations of 
inbreeding. ^"""^^ The parental lines differ at approximately 
4.8 million SNPs and at another 500 000 insertions and 
deletions and copy number variations and the family as a 
whole is highly polymorphic.^"* The BXDs are a densely 
phenotyped and genotyped family and have been used as a 
genetic reference panel for over 30 years. A compendium 
of over 2000 phenotypes and over 45 microarray data sets 
for these strains is available from GeneNetwork (http:// 
www.genenetwork.org). 

For this study, transcript measurements were taken from 
the hippocampi of 70 BXD strains, the parental B6 and D2, 
and reciprocal F1 hybrids (B6D2F1 and D2B6F1). Details on 
the strain, sex and age of each animal can be accessed from 
http://www.genenetwork.org/dbdoc/UMUTAffyExon_0209_ 
RMA.html. The mice were bred at the University of 
Tennessee Health Science Center or at the Jackson 
Laboratory, Bar Harbor, ME, USA. Hippocampal dissections 
were carried out at UTHSC or at the Beth Israel Deaconess 
Medical Center by our colleague, Dr Glenn D Rosen (details 
on dissection and tissue processing are in Supplementary 
methods). All animal procedures were approved by the 
Animal Care and Use Committees at UTHSC and BIDMC. 

Microarray data processing. To provide a comprehensive 
analysis of natural variation in gene expression in the 
mouse hippocampus, we have previously generated trans- 
criptome profiles using conventional Affymetrix arrays 
for -^100 strains of mice.^^ In this study, we used 
exon arrays (Affymetrix Exon 1.0 ST, Affymetrix, http:// 
www.affymetrix.com) to provide a more detailed but still 
global analysis of the genetic control of alternative splicing in 



the hippocampus. The Affymetrix Exon 1 .0 ST arrays contain 
a total of 1 .2 million probe sets and uses at least four probes 
to interrogate an exon. Probe level intensity values were 
extracted from the CEL files using the GeneChip Operating 
Software provided by Affymetrix. Probe set level 
summarization and data normalization was done within 
GeneChip Operating Software using the Robust Multichip 
Averaging method. The expression values were then log- 
transformed and stabilized to a mean of 8 and an s.d. of 2 
across the samples. The entire data set (BXD hippocampus 
mRNA UMUTAffy Hippocampus Exon (February 2009) 
RMA) can be downloaded from GeneNetwork data sharing 
site using the accession number GN206 (http://www. 
genenetwork.org/share/data/). 

Array data analysis on GeneNetwork. QTL mapping was 
done for the probe set level data using QTL Reaper.^^'^^ The 
results are presented as likelihood ratio statistic (LRS) scores 
(LOD score = LRS/4.61). We used 3795 informative 
microsatellite and SNP markers in the BXDs for interval 
mapping. The BXD genotypes can be downloaded from 
http://www.genenetwork.org/share/data/. Up to a million 
permutations were performed to compute the genome-wide 
suggestive and significant thresholds. Nrxn1 is located on 
chromosome (Chr) 1 7 between 90.432 and 91 .492 Mb (mm9, 
build 37) on the minus strand and is targeted by 172 probe 
sets. For the probe sets that targeted exons in Nrxn1 and 
mapped as cis eQTL, we screened the probe sequences for 
overlapping SNPs that may confound the QTL analysis. 
Although Nrxn1 is SNP rich, most exons are devoid of SNPs. 
With the exception of exons 5 and 17 (each harbors a 
synonymous mutation), the exon targeting probe sets do not 
overlap SNPs and the consistent cis effect is reliable. 

For the co-expression analysis in the mouse hippocampus, 
pair-wise Pearson product moment correlations were com- 
puted for all probe sets based on expression variation across 
the 74 lines of mice. The top 1000 probe sets correlated with 
Nrxn1 were filtered by expression level. Only probe sets with 
expression > 8 on a log2 scale, and well-annotated probe sets 
that map to known genes were retained for subsequent 
analysis. Co-expression results are presented for a repre- 
sentative probe set (Affy 4930721) targeting a middle exon 
(exon 12) that has high expression, is associated with both 
eQTLs on Chrs 17 and 8 (see Results section), and is well 
correlated with other Nrxn1 exons. Analyses using 
other exons from the same correlated block of exons give 
similar results. We also used a publicly available human 
forebrain expression data (neocortex and hippocampus) 
generated using the Affymetrix U133 Plus 2.0 arrays (GEO 
accession number GSE5281).^^''^° This data set contains 
samples from Alzheimer's disease cohort and age-matched 
control subjects. We computed Pearson product correlations 
across 71 samples from normal subjects and retrieved the 
top 1000 covariates of NRXN1 from the pool of over 54 000 
probe sets. 

RNA sequencing. Total RNA was isolated from the 
hippocampus of adult female B6 and D2 mice. Ribosomal 
RNA was depleted using the RiboMinus eukaryotic kit (http:// 
www.invitrogen.com). We prepared multiplexed bar-coded 
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fragment libraries using the SOLiD Fragment Library 
Barcoding kits. Emulsion PGR was then done and libraries 
were amplified and sequenced on the ABI SOLiD 3 plus 
system (http://www.appliedbiosystems.com). 

We generated a total of 77 million and 80 million 50 
nucleotide RNA sequencing (RNA-seq) tags for 86 and D2, 
respectively. Of these, approximately 58 million and 63 million 
reads aligned to the mouse reference genome (NCBI37/mm9) 
using the ABI Bioscope software (http://www.solidsoftware- 
tools.com). The alignment results (BAM files) were uploaded 
to the Partek Genomic Suite (http://www.partek.com) and 
GeneNetwork's UCSC Genome Browser mirror (http://ucsc- 
browser.genenetwork.org/) for visualization. Number of se- 
quence tags was counted for each exon feature within Nrxn1. 
The entire data can be downloaded from GeneNetwork's high- 
throughput sequence data sharing site at http://galaxy.gene- 
network.org:8080/library. Additionally, read alignments can 
also be visualized on the GeneNetwork Genome Browser. 

QTL analysis for behavioral traits. To identify behavioral 
phenotypes that may be downstream of Nrxn1 we used a 
point-wise P-value of 0.01. We selected phenotypes that 
correlated with marker rs6313030 (located on Chr 17 at 
90.460122 Mb within the Nrxn1 gene) and then analyzed 
behavioral traits assayed in the BXDs using marker 
regression. The behavioral phenotypes with QTLs on distal 
Chr 17 are (1) prepulse inhibition, assayed by McCaughran 
etal^^ in a panel of 21 BXD strains (trait ID on Genenetwork 
is 10396), (2) anxiety trait measure by time spent in open 
quadrant of zero-maze, assayed in a larger panel of 57 BXD 
strains"^^ (trait ID 1 1696) and (3) handling induced convulsion 
as an index of ethanol withdrawal severity, measured in 25 
BXD strains^^ (trait ID 10065). 

Gene-gene interaction analysis. To explore epistatic 
associations between NRXN1 and GSK3B, we first 
extracted all markers in NRXN1 and GSK3B and in the 
20 kb flanking regions from the MGS data set.^° GAIN and 
non-GAIN data sets^° were combined and markers common 
to both were used. Association tests and minor allele 
frequency (MAP) were calculated using PLINK 2.050"^"^ and 
markers with minor allele frequencies ^0.05 were selected 
(384 in NRXN1 and 17 in GSK3B). Sporadic missing 
genotypes in these markers were imputed by MDRdt Beta 
VO.4.3 (http://www.epistasis.org). Multifactor dimensionality 
reduction (MDR) method was used to conduct gene-gene 
interaction analyses (MDR 2.0 beta 8.1). The details of MDR 
are described elsewhere.^'^'^^''^^ We used two methods to 
further filter the markers. One is the odds ratio filter 
implemented in the MDR software, which uses the case 
control ratio in the data set to define high-risk and low-risk 
genotype combinations.'*^"'*^ Using this approach, the top five 
markers from each gene were selected. The other filter was 
based on single marker association P-values in which tagged 
SNPs^^ with association P^0.15 in both genes were selected 
(a total of 13 markers in the two genes). SNPs selected by 
both methods were used for MDR analyses and best models 
were generated based on a 10-fold cross-validation 
consistency. In this study, we limited the search to a 2-loci 
and 3-loci interaction model using SNPs from the first and 



second filters, respectively. The significance of the best model 
was evaluated by 10000 permutations (MDRpt 1.0 beta 2).^° 

Additional bioinformatics tools. Gene ontology and 
pathway enrichment analysis was done using DAVID 6.7 
(http://david.abcc.ncifcrf.gov).^'''^^ Protein-protein network 
analysis was done using the protein interaction database 
and network tool available on GeneMania (http:// 
www.genemania.org). The protein network was based on 
known physical interactions curated from 114 sources, and 
the network was weighted by biological process. 

Results 

Genetic regulation of Nrxn1 exons. Nrxn1 has 
widespread expression in the brain. We used microarrays 
to measure its transcript at exon level resolution in the 
hippocampus of 74 members of the BXD family. Additionally, 
we measured expression in parental strains using RNA-seq. 
The very deep coverage provided by the arrays and RNA- 
seq uncovered significant inter-exon variability in Nrxn1 that 
may reflect alternative isoforms (Figure la). For example, 
low expressing exons (exons 2, 6, 11 and 19) are located 
within known splice sites^°'^^ and the RNA-seq also showed 
a potentially skipped exon (exon 4 of Refseq gene 
NM_020252.3) (Figure lb). To evaluate how tightly 
expression of exons is coupled with one another, we 
generated a correlation matrix of the exons and UTRs 
based on expression variation across the BXD family 
(Figure 1c). For the most part, exons form highly correlated 
expression blocks indicating that they are regulated as a 
single unit. The few exceptions are exons 1 , 3, 1 1 , 21 , 22 and 
the UTRs, which show poor correlation with most of the other 
exons and may be alternatively spliced. 

Nrxn1 expression is also highly variable across the strains 
with exons varying from 1.5-fold to 10-fold in abundance 
(Figure la). On average, the exons show a 2.5-fold variation 
among the BXDs. To track down genetic sources of this 
variation, we carried out eOTL mapping using conventional 
linkage analysis tools and genotypes embedded in GeneNet- 
work. The UTRs and canonical exons of Nrxn1 are partly 
controlled by polymorphisms near the gene itself (so-called cis 
eOTL) and partly by frans-acting polymorphisms (Figure lb). 
A tightly correlated set of exons map as a cis eOTL to distal 
Chr 17 within 0.5 Mb of Nrxn1 itself. These exons are also 
strongly co-modulated by a trans regulatory QTL on mid- 
Chr 8. Several exons that are putatively subjected to splicing 
(for example, exons 1, 8, 11 and 21) are not modulated by 
either the cis or Chr 8 trans loci. Additionally, there are trans 
eOTLs on other Chrs associated with a minor set of exons 
(Supplementary Figure SI). For example, exons 12, 16 and 
19 have trans eOTLs in mid-Chr 2 and exons 1 , 3, 1 1 and 22 
have trans eOTLs on Chr 1 2. These findings indicate a level of 
isoform specificity and joint eOTL control that corresponds 
to the complex splicing signatures in the exon co-expression 
matrix. 

Sequence polymorphism in Nrxn1 between C57BL/6J 
and DBA/2J. The maternal parent of the BXD strains, B6, 
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Figure 1 Expression of neurexin 1 (Nrxn1) exons in mouse hippocampus, (a) Exon-level expression of Nrxn1 was measured by direct sequencing of mRNA (upper bar 
chart; error bars are s.d.) and by using exon arrays (lower box plots). Expression values are on a log2 scale (y axis). There is significant variation in expression of exons among 
the BXDs. On average, expression varies 2.5-fold within exons. Based on RNA sequencing, exons 2 and 6 have low expression and exon 4 is skipped (there are no array 
probe sets for these), (b) The gene model depicted is based on RefSeq gene NM_020252.3. Location of the 5' UTR for ^-Nrxn1 is shown in intron 17 (arrow) and known splice 
sites are indicated above (1 -5). Approximate locations of insertion and deletion (indel) variants that are larger than 1 0 bps are shown below (lines for deletions and triangles for 
insertions in D2). Expression QTLs on chromosomes (Chrs) 8 (trans) and 17 (cis) are shown as heat maps for each exon and UTR. Allele from the DBA/2J parent has the 
positive effect on Chr 8 (red) while allele from the C57BL/6J parent has the positive effect on Chr 17 (blue), (c) The correlation matrix displays pair-wise correlations between 
the UTRs and exons in the BXDs. The lower cells are Pearson product moment and upper cells are Spearman ranked correlations. Exons 5-20 form a tightly correlated 
expression block (exon 11 being the exception). Correlation between the distal and proximal parts of the gene is generally low. 



was the first mouse genome to be sequenced and is used 
as tine mouse reference genome. Tine paternal strain, D2, 
has recently been resequenced at high density using 
next-generation systems (brief description of the 100x 
resequencing of the D2 genome is provided in 
Supplementary methods). This now makes it possible to 
carefully review all sequence variants in and around Nrxn1 to 
help identify variants that are candidates for the c/s-effect on 
expression. We found numerous sequence differences 
between the B6 (B) and D2 (D) alleles. There are over 
2900 SNPs in Nrxn1. Of these, only nine are in the 3' UTR, 
two are synonymous mutations in exons 5 and 17, and the 
remaining SNPs are in introns. In all, 15 SNPs in introns have 
potentially high impact based on overlap with predicted 
functional elements in conserved regions of introns^"^ 
(Supplementary Table SI). The introns also contain a large 
number of insertions and deletions (defined as insertion or 
deletion in D relative to B). There are 436 deletions and 340 



insertions ranging in length from 1 to 241 8 bp (Figure lb and 
Supplementary Table SI). Although none of these overlap 
canonical exons, such structural variations can have 
significant effects on gene expression. We selected a 74- 
bp insertion and a 107-bp deletion located in intron 17, which 
harbors the alternative promoter for the p-variant^°'^^ and 
confirmed the variants by PGR amplification (Supplementary 
Figure S2). 

Genetic analysis of Nrxn1 co-expression networlc. To 

elucidate a co-expression network associated with the Nrxn1 
transcript, we ranked the top 1000 correlates of Nrxn1 
expression from the pool of 1.2 million probe sets. These 
represent transcripts that have coordinated expression with 
Nrxn1 and are potential partners in cellular functions. These 
top covariates (Pearson product r> 10.61, P<3.7x10~^) 
correspond to 572 known genes (Supplementary Material B). 
Consistent with the synaptic involvement of Nrxn1, this set of 
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genes is enriched in synapse-related transcripts and genes 
known to directly interact with Nrxn1 (for example, Cask and 
Nlgn1) (Supplementary Material B). More interestingly, 
there is also enrichment in genes that are part of the WNT, 
axon guidance, mitogen-activated protein kinase and ErbB 
signaling pathways (for example, Gsk3b, Akt1 and Nrg3) and 
this suggests that there is close interaction between Nrxn1 
and downstream cellular signaling. We also used the Genetic 
Association Database (http://geneticassociationdb.nih.gov) 
and were able to extract a set of 20 schizophrenia 
candidate genes (Figure 2a) in our top 572 Nrxn1 
covariates (Benjamini corrected enrichment P< 0.001). 

eQTL mapping of the co-expression network members 
detected regulatory loci (that is, trans eQTL hotspots) on (1) 
mid-Chr 2 (67-77 Mb), (2) proximal Chr 6 (2-20 Mb) and (3) 
mid-Chr 8 (50-85 Mb). The first principal component (PC1) 
summarization of these highly correlated transcripts, also 
called an eigengene,^^ retained this whole-genome eQTL 
pattern. This is illustrated by the eigengene for the tightly 
correlated schizophrenia network: the PC1 of the 20 
transcripts accounts for 55% of the total variance and maps 
as suggestive QTLs to Chrs 2, 6 and 8 (Figure 2b). PC1 of the 
Nrxn1 exons (accounting for -^50% of the total variance) has 



significant QTLs on Chrs 8 and 17, and suggestive QTL on 
Chr 2 (Figure 2c). Although exons 7, 8, 11 and 12 of Nrxn1 
map as weak trans eQTL to proximal Chr 6 (Supplementary 
Figure SI), this association does not reach the suggestive 
threshold in the summarized map. Based on the comparison 
of eQTLs, Nrxn1 is not likely to be a QTL hotspot for the 
expression traits. Instead, the co-expression network is 
modulated mainly by eQTL hotspots on Chrs 2, 6 and 8. A 
portion of the variance in Nrxn1 comes from these trans- 
effects that connect Nrxn1 to the network. 

Mouse behavioral phenotypes that map close to 
Nrxn1. We then asked whether mouse phenotypes that 
may be concordant with human traits are associated with the 
Nrxn1 locus. GeneNetwork maintains a database of over 
2000 phenotypes collected from the BXDs. We specifically 
searched this database for behavioral traits that have linkage 
with markers near Nrxn1. We found three such traits that 
have significant point-wise QTLs on distal Chr 17. First, 
PPI to acoustic startle mapped to distal Chr 17 with a 
point-wise P= 0.0003 (LRS = 16) (Figure 2d). Second, 
anxiety phenotype assayed on the zero-maze"*^ mapped to 
distal Chr 17 with point-wise P= 0.005 (LRS = 10.5). The 
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Figure 2 Overlap of co-expression and behavioral QTLs. (a) An enriched set of twenty schizophrenia candidate genes was extracted from the covariates of neurexin 1 
(Nrxn1) in the mouse hippocampus. These transcripts form a highly interconnected co-expression network with Nrxn1. (b) The co-expression network is modulated by 
expression QTLs (eQTLs) on chromosomes (Chrs) 2, 6 and 8. This is illustrated by the QTL map for the first principal component summarization (PCI ) of the 20 schizophrenia 
candidates. The x axis represents Chrs 1 to X and the y axis shows the LRS score. The horizontal lines indicate the genome-wide suggestive and significant thresholds, 
(c) PCI of the Nrxn1 exons has significant QTLs on Chrs 8 and 17, and suggestive QTL on Chr 2. Location of Nrxn1 in the cis eQTL is depicted by the triangle. Mouse 
behavioral traits that map close to the Nrxn1 locus at significant point-wise P-values (P<0.01) are prepulse inhibition (d), anxiety trait (e) and ethanol response (f). 
Overlapping QTL locations are highlighted. 
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anxiety trait also has striking match with the Nrxn1 exons in 
genome-wide QTL pattern with additional QTLs on mid-Chr 8 
(LRS = 14; point-wise P= 0.0007) and Chr 2 (LRS of 8.4; 
point-wise P of 0.01) (Figure 2e). Finally, handling-induced 
convulsion as an index of alcohol withdrawal severity"^^ 
mapped to distal Chr 17 with a point-wise P of 0.007 
(LRS = 9.6) (Figure 2f). 

We note that the behavioral traits also have QTLs that are 
distinct from the QTLs associated with Nrxn1 and its co- 
expression network. The associations between the QTLs and 
multi-level phenotypes presents a complex genetic structure 
and the effect of Nrxn1 on the behavior traits can either be 
directly mediated by the cis eQTL, or more likely, is 
transmitted via the co-expression network with major con- 
tributions by the set of trans eQTLs. 



Cross-species analysis of tlie Nrxn1 covariates. We next 
examined if correlations at the mRNA level are preserved 
across species. Our aim was to identify strong and consistent 
correlations that are conserved in both mice and humans and 
that may be novel functional and genetic associates of 
Nrxn1. We used a human forebrain expression data set^^ '^^ 
and extracted the top 1000 covariates of NRXN1 (Pearson 
product r>l0.55l, P<3x10~^). This corresponds to 767 
unique genes (Supplementary Material B). Of these 767 
genes, only 42 overlap with the 572 co-expressed genes 
in mice and have conserved correlations with NRXN1 in 
human forebrain as well as in the mouse hippocampus 
(Supplementary Table S2). This includes GSK3B and 
NTNG1, which are both part of the Nrxn1 primed schizo- 
phrenia network derived from the mouse covariates. ^^'^^ 



To explore protein interactions that may be encrypted in the 
mRNA co-expression set, we extracted a protein-protein 
interaction network from the list of 767 transcript covariates 
(Figure 3a). In all, 261 out of the 767 query genes (34%) 
formed a highly interconnected protein network that incorpo- 
rated most of the conserved transcript correlates of NRXN1 
including GSK3B, and also several other schizophrenia 
candidates. GSK3B has diverse roles in cellular signaling in 
the brain^^"^^ and we examined the expression covariance of 
GSK3B and NRXN1 in a number of different mouse 
and human brain regions and found generally strong correla- 
tions between the two (Supplementary Figure S3). The 
strong genetic correlation suggests interactions between 
NRXN1 and GSK3B and places NRXN1 as a synaptic gene 
that may also function as a gateway to cellular signaling 
cascades. 

Test for epistasis between NRXN1 and GSK3B in liuman 
scliizoplirenia. We hypothesize that there is close 
functional relation between NRXN1 and GSK3B. We tested 
for gene-gene interactions in human using the MGS data set 
that has 2255 schizophrenia cases and 2641 controls. ^° We 
found significant 2-loci interaction between rs4563262 
(NRXN1) and rs4340737 (GSK3B) with a permuted 
P=0.012 (Figure 3b). Using a 3-loci model, there was 
significant interaction between rs6736816 and rs9309200 in 
NRXN1, and rs9826659 in GSK3B (permuted P= 0.0373) 
(Table 1). No covariate sex effect was found. Among the top 
five SNPs in the two genes, we also found that rs4340737 
in GSK3B and rs9309200 in NRXN1 are the only two SNPs 
that showed significant associations with schizophrenia 
(P= 0.01 71 for rs4340737 and P= 0.0293 for rs9309200). 
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Figure 3 Deriving protein-protein and genetic interactions from co-expressed genes, (a) The co-expression network for neurexin 1 (NRXN1) in the human brain was 
organized based on l<nown protein-protein interactions. This resulted in a single large network that connected 261 out of 767 co-expressed transcripts. The network 
incorporates several known schizophrenia candidates (shaded green) and several conserved covariates of NRXN1, that is, are correlated with NRXN1 in both human and 
mouse (shaded orange). NRXN1 and glycogen synthase kinase 3 beta (GSK3B) are part of this protein network (shaded red), (b) Test for epistasis using the multifactor 
dimensionality reduction method found significant interaction between rs4563262 in NRXN1 and rs4340737 in GSK3B. Distribution of cases (left bar in cell) and controls (right 
bar in cell) stratified by multilocus genotype. Each multifactorial cell is labeled as 'high risk' or 'low risk'. Average balanced prediction accuracy for the model was 51 .85%, 
permuted P= 0.0123. 
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Table 1 Multilocus epistatic interaction models between NRXN1 and GSK3B 
No. of loci Best model 



2 rs4563262 (NRXN1) 
rs4340737 (GSK3B) 

3 rs673681 6, rs9309200 (NRXN1) 
rs9826659 (GSK3B) 



Abbreviations: GSK3B, glycogen synthase kinase 3 beta; NRXN1, neurexin 1. 



Evidence from mouse to prioritize potential candidates 
for scliizoplirenia. Our study demonstrates that gene co- 
expression analysis can effectively select related disease 
candidates and epistatic partners. We therefore used 
evidence from the co-expression analysis and mouse 
behavioral endophenotypes to further prioritize other 
potential candidates. For this, we considered genes in the 
MGS data set that were nominally associated with 
schizophrenia. The study by Shi et al.^^ generated a list of 
1202 SNPs in the European-American cohort and a list of 
997 SNPs in the African-American cohort with association 
P< 0.001. We mined the SNP lists for nearby genes with 
mouse homologues that are among the top covariates of 
Nrxn^ in the mouse hippocampus. Additional functional 
evidence was gathered from knockout studies and gene 
ontology. On the basis of this, we identified 30 genes that are 
nominally associated with schizophrenia and are co- 
expressed with Nrxn1. Several of these are already known 
candidates for neuropsychiatric disorders including 
schizophrenia, autism, attention deficit hyperactivity disorder 
(ADHD), and bipolar disorder (Table 2).^°"^^ Strikingly, 10 out 
of the 30 genes are also nominally associated with substance 
dependence and nicotine addiction.^"^'^^ In the mouse, 
deletion of six of these candidates (CTNND2, DMN, 
GSK3B, NRCAM, PLCB1 and RIMS1) lead to behavioral 
changes related to PPI, social and cognitive functions, 
hyperactivity, fear learning and anxiety-related traits.^^"^^ 

Among the novel genes in Table 2, we highlight Pleckstrin 
and Sec7 domain containing 3 (PSD3) as both a potential new 
candidate for schizophrenia and a candidate for the trans- 
effect associated with the co-expressed transcripts in the 
mouse brain. This gene has high expression in the brain and is 
located in mid-Chr 8 within the trans eOTL for the Nrxn1 
exons. Expression of Psd3 is associated with a significant cis 
eOTL in the BXDs and is also strongly correlated with Nrxn1 
(Supplementary Figure S4). 



Discussion 

Synopsis. Nrxn1 is highly polymorphic and is associated 
with significant variation in expression among mice. We have 
exploited this variation to trace co-expression networks and 
eOTLs that link Nrxn1 with other genes and also with higher 
order behavioral phenotypes. GSK3B is among the strongest 
and most consistent covariates of NRXN1 in both mice and 
humans and we detected a significant epistatic interaction 
between these genes that modulates schizophrenia risk. 



Cross-validation 1 0 000 permutation 
consistency P -value 

51.85 10/10 0.0123 

51.86 10/10 0.0373 



This cross-species analysis also highlights novel candidates 
for schizophrenia in human, especially PSD3. 

Sub-transcript level analysis of gene expression 
regulation. For a fine-scale analysis of Nrxn1, we have 
treated each exon, intron and the UTRs as discrete 
expression units that can be individually subjected to 
genetic analysis. This revealed some level of complexity in 
transcript regulation. For instance, exons 8, 1 1 and 21 show 
low correlation with other exons and are also associated with 
distinct eOTLs. Aside from these exceptions, the majority of 
exons have tightly correlated expression, partly due to 
shared modulation by two prominent eOTLs — one a cis 
eOTL at Nrxn1, and the other, a trans eOTL on mid-Chr 8. 

The mid-Chr 8 trans eOTL also modulates several key 
genes correlated with Nrxn1 including Psd3, Gsl<3b, Snap25 
and Kpna3. This interval corresponds to human 8p21, a 
region that contains many genes linked to brain develop- 
mental and psychiatric disorders. Polymorphisms in this 
region in humans may also have frans- regulatory effect on the 
expression of other schizophrenia susceptibility genes. In a 
striking concordance between mouse and human, this part of 
mouse Chr 8 is also a QTL hot spot for brain structural traits in 
the BXDs and is linked to variations in the volume of the 
amygdala, hippocampus, cerebellum and striatum. ^^'^^ Lo- 
cated in this mid-Chr 8 QTL is Psd3, a gene that is modulated 
by a cis eOTL and is also co-expressed with Nrxn1. In the 
human MGS data set, SNPs near PSD3 are nominally 
associated with schizophrenia (P= 0.00007). Although little 
is currently known about the function of PSD3, the converging 
evidence from multiple sources makes this gene a high- 
priority candidate for the eOTL in mouse and a novel 
candidate for schizophrenia in human. 

Variation in Nrxn1 expression may influence 
schizophrenia endophenotypes. In the BXDs, PPI to 
acoustic startle maps to distal Chr 17 close to the Nrxn1 
locus. PPI is a classic endophenotype that models sensory 
gating deficits in schizophrenia patients,^^ and polymorphisms 
in Nrxn1 are strong candidates with compelling face 
validity. Consistent with association to schizophrenia, 
deletion of Nrxnia in mice results in decreased PPI.^^ 
However, in this knockout, social behavior and anxiety traits 
are unaffected .^^ 

Unlike studies using gene knockouts, we have evaluated 
naturally occurring variants that will generally have more 
subtle effects on phenotypes. Most of the SNPs and insertions 
and deletions in Nrxn1 are intronic and are unlikely to cause 
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gene disruption. However, these more moderate variants 
model the type of variation seen in human populations. A 
detailed analysis of structural variations in the NRXN genes in 
human has found a range of deletions and duplications.^^ The 
deletions in NRXN1, particularly those that span exons, are 
strongly associated with schizophrenia. Additionally, there are 
possible associations with other psychiatric conditions and 
alcoholism that indicate pleiotropic effects of NRXN1.^^ This is 
consistent with our findings in mouse in which QTLs for 
ethanol response and anxiety map near Nrxn1. 



Potential functional and epistatic partners: Nrxn1 and 
Gsk3b. Hundreds of susceptibility variants have been 
identified for schizophrenia in recent large-scale genomic 
studies. Associated genes fall into diverse functional 
categories and this supports the impressive genetic 
heterogeneity of complex neuropsychiatric diseases.^^'^'' 
The approach we have taken here is to use a population 
model and exploit endogenous variations in a high-priority 
candidate as a starting point for a systematic gene-to- 
phenotype analysis. NRXN1 has been repeatedly linked to 
schizophrenia and autism''^"'' ^ and as we show, a co- 
expression network primed by this gene captures other 
genes that work closely with Nrxn1 and are involved in 
disease pathway. 

Co-expression and eQTL analyses led us to genes already 
linked to Nrxn1 (for example. Cask, Nlgn1, y-aminobutyric 
acid and glutamate receptors), and also led us to novel genes 
that have less obvious, but potentially interesting links to 
Nrxn1. Gsk3b is one such gene that we detected and then 
validated as a novel genetic partner of Nrxn1. GSK3B has 
diverse functions in the cell, including synaptic and intracel- 
lular signal transduction^^"^^ and could therefore be a conduit 
between the synaptic role of NRXN1 and cell signaling. 
Supporting this hypothesis, we found significant genetic 
interactions between SNPs in NRXN1 and GSK3B that 
indicate that susceptibility to schizophrenia can be modified 
to an extent by the combination of alleles in the two genes. For 
example, in the 2-loci model, the AA genotype at rs4563262 
(NRXN1) and the AG and GG genotypes at rs4340737 
(GSK3B) interact to increase or decrease vulnerability. 

Despite the genetic interactions, to our knowledge, no study 
has yet demonstrated direct physical link between NRXN1 
and GSK3B. However, the presynaptic region is a potential 
locus of cellular interaction between the two proteins. GSK3B, 
in addition to its many functions in cell signaling, also 
modulates presynaptic vesicle release by phosphorylating 
the voltage-dependent calcium channels and reducing in- 
tracellular Ca^+ levels.^^ It also regulates the development of 
the active zone — the site of vesicle fusion and neurotransmit- 
ter release. NRXN1, a predominantly presynaptic protein, 
modulates release by selectively acting on voltage-dependent 
calcium channel (P/Q and N type), possibly by modifying the 
coupling of Ga^+ channels to release-ready vesicles.''^ 
Variants of NRXN1 and GSK3B may therefore have a 
collective influence on the synaptic environment and conse- 
quently on disease susceptibility. Both the genetic interaction 
and a potential molecular interaction are consistent with a 
model in which neurotransmission and intracellular signaling 
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are coupled in the etiology of schizophrenia and other 
psychiatric disorders. 
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